Lightweight 3D Human Pose Estimation Network Training Using Teacher-Student Learning
Published:
Review of paper named ‘Lightweight 3D Human Pose Estimation Network Training Using Teacher-Student Learning’
This paper presenting MoVNect which is a lightweight deep neural network(DNN) with teacher-student learning to capture 3D human pose in mobile devices.
There are several way to lightwieghting the models. Most of the neural network such as ResNet, DenseNet, and GoogleNet have so many layers contained itself, so it takes alot of time and calculation for each processes. This large amount of model size is not compatible to real world.
Handling jitter problem
By adding 1 euro filter
- Neural Network Pruning After finish connecting parameters between nodes, some nodes are definitely less important than other major nodes. We neglect or delete these less important connections between nodes to lightning the model. However, this process will decrease the accuracy of model. It needs extra training after deletion of model connection to enhancement.
- Low Rank Approximation Singular Vector Decomposition, Filter-Bank
- Quantization Normally all the parameters are 32bit or 64bit for default. However, some of parameters are not requiring precise precision number. 16, 8, 4 bit precision still shows good performance contrast to original precision. Additionally on the parallel computing depends on hardware performance even gets higher.
- Knowledge-Distillation
Basic Architecture of CNN
This architecture shows diverse concepts
- Convolution + ReLu : Activation function with diver convolution
- Pooling : Max Pooling, Average Pooling
- Fullpy Connected Layers
Reference
https://arxiv.org/pdf/2001.05097.pdf
https://post.naver.com/viewer/postView.nhn?volumeNo=20748771&memberNo=36733075
https://arxiv.org/pdf/1705.01583.pdf
https://www.youtube.com/watch?v=7UoOFKcyIvM&feature=youtu.be
https://gaussian37.github.io/dl-concept-mobilenet_v2/
https://hal.inria.fr/hal-00670496/document
Github